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Abstract 

The  application  of  multiscale  and  stochastic  techniques  to  the  solution  of  a  lin¬ 
earized  inverse  scattering  problem  is  presented.  This  approach  allows  for  the  explicit 
and  easy  handling  of  many  difficulties  associated  with  problems  of  this  type.  Reg¬ 
ularization  is  accomplished  via  the  use  of  a  multiscale  prior  stochastic  model  which 
offers  considerable  flexibility  for  the  incorporation  of  prior  knowledge  and  constraints. 
We  use  the  relative  error  covariance  matrix  (RECM),  introduced  in  [20],  as  a  tool  for 
quantitatively  evaluating  the  manner  in  which  data  contributes  to  the  structure  of  a 
reconstruction.  Given  a  set  of  scattering  experiments,  the  RECM  is  used  for  under¬ 
standing  and  analyzing  the  process  of  data  fusion  and  allows  us  to  define  the  space- 
varying  optimal  scale  for  reconstruction  as  a  function  of  the  nature  (resolution,  quality, 
and  distribution  of  observation  points)  of  the  available  measurement  sets.  Examples 
of  our  multiscale  inversion  algorithm  are  presented  using  the  Born  approximation  of 
an  inverse  electrical  conductivity  problem  formulated  so  as  to  illustrate  many  of  the 
features  associated  with  inverse  scattering  problems  arising  in  fields  such  as  geophysical 
prospecting  and  medical  imaging. 
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1  Introduction 

A  common  objective  of  many  applied  inverse  problems  is  the  desire  to  recover  characteristics 
of  a  medium  based  upon  observations  arising  from  the  interaction  of  transmitted  energy 
with  the  unknown  environment.  As  discussed  extensively  in  [1],  such  problems  are  found 
in  fields  as  diverse  as  medical  imaging,  nondestructive  testing,  oceanography,  remote  sens¬ 
ing,  ultrasonic  imaging,  electrical  impedance  tomography,  and  geophysical  prospecting.  The 
model  problem  considered  in  this  work  is  a  two-dimensional  inverse  electrical  conductivity 
problem  illustrated  in  Figure  1.  Here,  electromagnetic  sources  (indicated  by  the  black  cir¬ 
cles)  emit  time-harmonic  waves  into  a  lossy  medium.  These  primary  fields  are  scattered  by 
conductivity  inhomogeneities  located  in  the  darkly  shaded  rectangle  and  the  secondary  fields 
are  observed  at  one  or  both  receiver  arrays  located  on  either  vertical  edge  of  region  under 
investigation.  Based  upon  these  observations,  the  objective  of  the  inverse  problem  is  the 
reconstruction  of  the  conductivity  perturbation. 

Inverse  problems  in  general,  and  in  particular  problems  of  the  type  illustrated  in  Figure 
1,  present  a  number  of  significant  and  well- recognized  challenges.  The  objective  of  this 
paper  is  to  present  and  demonstrate  a  methodology  for  the  use  of  multiresolution  methods 
and  concepts  in  dealing  with  these  issues.  In  particular,  to  provide  some  perspective  and 
motivation  for  our  work,  we  begin  with  a  brief  look  at  several  of  these  challenges. 

A  first  major  issue  that  must  be  confronted  in  solving  inverse  problems  is  computa¬ 
tional  complexity.  Indeed,  while  nonlinearities  in  the  relationship  between  observables  and 
unknowns  present  a  considerable  computational  challenge  [3,24],  the  solution  of  linearized 
inverse  problems  -  arising  either  in  the  iterative  solution  of  the  nonlinear  problem  or  from 
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a  linearized  approximation  to  the  true  physics  of  the  problem  -  also  represents  a  compu¬ 
tationally  intensive  task.  In  particular,  the  so-called  first  Born  linearization  [14, 23]  of  the 
inverse  conductivity  problems  considered  in  this  paper  leads  to  large,  dense  sets  of  linear 
equations  due  to  the  nonlocal  nature  of  the  relationship  between  conductivity  and  mea¬ 
surements.  While  the  task  of  solving  such  systems  of  equations  certainly  is  less  intensive 
than  obtaining  a  solution  to  the  nonlinear  problem,  it  can  still  be  prohibitively  expensive 
especially  for  problems  involving  large  amounts  of  data  and  a  very  fine  discretization  of  the 
conductivity  profile.  Additionally,  the  complexity  of  these  equations  not  only  makes  them  a 
challenge  to  solve  efficiently,  but  it  also  places  a  major  impediment  in  the  way  of  other  and 
equally  important  calculations.  For  example,  the  calculation  of  error  statistics  associated 
with  the  solution  to  these  inverse  problems  is  a  considerably  more  complex  problem  than 
calculating  the  estimates  themselves. 

As  we  will  see,  the  use  of  the  wavelet  transform  to  produce  multiresolution  representations 
for  the  unknown  conductivity  field,  the  measurements,  and  the  relationship  between  these 
two  quantities  simplifies  the  analysis  considerably,  making  complex  calculations  simpler  and 
prohibitively  complex  ones  possible.  Roughly  speaking,  the  wavelet  transform  provides  a 
decomposition  in  scale  so  that  a  nonlocal  integral  measurement  of  conductivity  more-or-less 
becomes  a  local  measurement  of  a  coarse  feature  of  the  underlying  conductivity  distribution. 
As  a  result,  the  dense  equations  relating  conductivity  to  the  observed  data  become  sparse  in 
the  transform  domain  leading  to  efficient  inversion  algorithms  [19,21]. 

Moreover,  the  use  of  this  multiscale  representation  facilitates  the  solution  of  the  problem 
of  multisensor  fusion  and  in  particular,  the  combined  inversion  of  data  from  measurement 
sets  with  different  resolutions  and  spatial  coverages.  In  many  inverse  scattering  problems,  a 
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large  quantity  of  data  from  a  collection  of  scattering  experiments  may  be  used  to  generate 
the  reconstruction;  however,  the  information  conveyed  by  each  measurement  process  may  be 
far  from  complete.  Referring  to  Figure  1,  in  the  case  of  the  conductivity  problem  considered 
in  this  paper,  the  measurements  from  a  single  experiment  consist  of  the  data  obtained  over 
one  receiver  array  in  response  to  one  of  the  three  sensors  operating  at  a  particular  frequency. 
Given  the  measurements  from  many  such  experiments,  there  is  a  need  for  understanding 
precisely  how  the  different  data  vectors  contribute  unique  information  to  a  reconstruction 
and  the  manner  in  which  measurements  from  different  sources  are  merged  by  the  inversion 
routine.  For  example,  the  resolution  and  coverage  provided  by  measurements  from  the  left 
array  in  Figure  1  (with  sources  on  the  left)  are  considerably  different  from  those  obtained  from 
the  right  hand  array.  By  using  multiresolution  representations,  the  information  provided  by 
these  data  sets  are  explicitly  related  to  the  corresponding  scales  in  the  representation  of 
conductivity  making  fusion  simpler  to  perform  and  transparent  to  understand. 

A  last  major  challenge  concerning  inverse  problems  is  that  they  are  frequently  ill-posed. 
For  example,  the  problem  illustrated  in  Figure  1  is  ill-posed  to  the  extent  that  the  restriction 
of  collected  data  to  the  boundaries  of  the  medium  combined  with  the  physics  governing 
the  propagation  of  energy  through  a  lossy  medium  make  exact  inversion  a  mathematical 
impossibility  or,  at  best,  an  unacceptably  sensitive  procedure  in  which  slight  measurement 
errors  are  greatly  magnified  by  the  inversion  process.  Traditionally,  this  difficulty  is  overcome 
via  the  use  of  a  regularization  procedure  which  serves  to  stabilize  the  original  inverse  problem 
so  that  a  unique,  physically  plausible  solution  may  be  computed  [10].  Indeed,  even  if  the 
problem  is  not  ill-conditioned,  a  regularizer  may  be  incorporated  as  a  means  of  constraining 
the  reconstruction  to  reflect  prior  knowledge  concerning  the  behavior  of  this  function  [18]. 
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For  example,  it  is  common  practice  to  regularize  a  problem  so  as  to  enforce  a  degree  of 
smoothness  in  the  reconstruction  [10,18].  As  discussed  in  [20],  and  in  the  next  subsection, 
regularization  techniques  involving  such  differential  penalties  have  direct  interpretations  as 
specifying  prior  statistical  models  on  the  phenomenon  to  be  imaged.  In  principle,  this 
provides  a  basis  for  the  calculations  of  error  variances  and  for  considering  questions  such 
as  the  tradeoff  between  the  resolution  of  reconstruction  and  the  accuracy  of  the  generated 
image,  the  value  of  additional  measurement  sets,  etc.  However,  as  discussed  in  Section  2.3, 
performing  such  analysis  using  traditional  regularization  formulations  is  a  formidable  and 
often  prohibitive  task. 

By  using  a  wavelet-based  multiresolution  framework,  we  are  directly  led  to  an  alternative 
method  for  statistical  regularization  in  the  wavelet  transform  domain  that  has  a  number  of 
attractive  properties.  First,  the  class  of  multiresolution  models  available  to  us  is  extremely 
rich,  allowing  us  to  capture  a  wide  range  of  characteristics  and  constraints  in  our  prior 
models.  In  this  paper,  we  consider  a  very  special  and  at  the  same  time  highly  useful  class 
of  multiscale  prior  models,  the  so-called  fractal  priors  model  for  the  conductivity  field.  As 
shown  in  [18],  this  model  is  related  to  the  traditional  smoothness-based  statistical  regularizers 
and,  with  appropriately  chosen  parameters,  produces  estimates  with  similar  characteristics. 
Moreover,  Wornell  [26]  has  demonstrated  that  this  model  is  useful  for  representing,  self¬ 
similar  stochastic  processes  possessing  1  / /-type  power  spectra  of  the  type  that  is  commonly 
used  to  describe  natural  phenomena  [7,25].  Such  a  model  is  especially  appropriate  for 
inverse  conductivity  problems  particularly  in  contexts  such  as  geophysical  exploration  where 
fractal  models  are  frequently  used  [9,25].  Indeed,  the  fractal  priors  are  exceptionally  easy 
to  implement  [26],  and  lead  to  scale-space  algorithms  which  are  orders  of  magnitude  more 
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efficient  than  those  estimation  schemes  operating  in  real-space  using  a  regularizer  based  upon 
some  differential  operator  [18]. 

The  direct  scale-space  form  of  these  models  facilitates  the  explicit  analysis  of  the  tradeoff 
between  the  incorporation  of  fine  scale  detail  in  a  reconstruction  and  the  accuracy  in  the 
resulting  estimate.  In  particular,  the  relative  error  covariance  matrix  (RECM)  introduced 
in  [20]  provides  a  rational  basis  for  dealing  with  resolution/accuracy  tradeoffs  and  identi¬ 
fying  the  optimal  scale  to  which  the  conductivity  may  be  reconstructed  as  a  function  of 
spatial  position,  the  physics  of  the  problem,  the  prior  model,  and  the  spatial  coverage  and 
measurement  quality  of  the  data  provided  to  the  inversion  algorithm.  Moreover,  the  RECM 
provides  an  explicit  description  of  the  information  provided  by  the  various  data  sources  both 
individually  and  collectively  at  each  point  in  space  and  scale.  Thus,  we  can  determine  those 
points  at  which  active  fusion  takes  place,  i.e.  those  points  where  the  information  provided 
by  several  data  sets  together  significantly  exceeds  that  provided  by  any  one  set  individually. 
Conversely,  we  can  map  the  scale  space  regions  in  which  inversion  is  dominated  by  a  single 
data  set.  Finally,  we  can  use  the  RECM  to  assess  the  incremental  value  of  additional  sources 
of  information 

In  the  next  section,  we  provide  an  overview  and  notation  for  the  specific  inverse  problem 
of  interest.  Section  3  is  devoted  to  the  development  of  the  physical  model  relating  the 
observables  to  the  conductivity  field.  In  Section  4.1,  we  present  the  details  behind  our 
wavelet-transform  domain  representation  of  the  inverse  problem.  Section  4.2  contains  the 
description  of  the  fractal  multiresolution  statistical  regularization  formulation  and  the  tools 
for  RECM-based  analysis.  A  set  of  examples  based  upon  the  inverse  conductivity  problem 
and  which  are  illustrative  of  the  different  facets  of  our  approach  are  presented  in  Section  5. 
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Finally,  conclusions  and  directions  for  further  investigation  are  given  in  Section  6. 

2  Problem  Overview 

2.1  An  Inverse  Electrical  Conductivity  Problem 

We  use  a  two-dimensional  inverse  electrical  conductivity  problem  similar  in  structure  to  that 
considered  in  [13,24]  as  the  primary  vehicle  for  illustrating  the  multiscale,  statistically-based 
inversion  algorithms  developed  in  this  paper.  The  form  of  the  problem  is  illustrated  in  Figure 
1.  Here,  we  have  a  set  of  three  electromagnetic  line-sources  oriented  perpendicularly  to  the 
page  emitting  time- harmonic,  cylindrical  waves  into  a  medium.  The  electrical  properties  of 
this  environment  are  assumed  to  be  decomposed  into  the  sum  of  two  parts:  (1)  an  infinite, 
known,  and  constant  background  and  (2)  a  conductivity  anomaly,  g,  which  varies  as  a 
function  only  of  the  two  variable  x  and  2  and  which  is  known  to  lie  in  a  closed  and  bounded 
area  of  the  plane,  denoted  as  A  and  indicated  by  the  darkly  shaded  region  in  Figure  1.  Upon 
interaction  with  the  medium,  the  electromagnetic  energy  is  scattered  and  the  resulting  field  is 
measured  by  one  of  two  arrays  of  receivers  located  on  either  vertical  edge  of  the  conductivity 
perturbation.  Each  array  is  composed  of  a  set  of  line  receivers  all  of  which  extend  infinitely 
in  the  direction  perpendicular  to  the  page. 

We  consider  inversions  based  upon  the  data  obtained  from  a  number  of  scattering  exper¬ 
iments.  Each  such  experiment  produces  a  vector  of  measurements  comprised  of  the  observed 
scattered  field  obtained  over  a  single  receiver  array  due  to  energy  put  into  the  medium  from 
one  of  the  three  sources  operating  at  a  particular  frequency.  Of  interest  in  Section  5  are 
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inverse  problems  corresponding  to  different  subsets  of  the  nine  experiments  defined  in  Table 
1.  Here  each  source  is  capable  of  operating  at  a  high,  middle  and  low  frequency  labeled  fm, 
Imid  and  fio  respectively.  In  any  of  these  problems  we  use  K  to  denote  the  number  of 
experimental  data  sets  used,  and  we  index  these  sets  as  i  =  1,  . . . ,  K. 

As  is  described  in  Section  3,  the  use  of  the  first  Born  approximation  yields  a  linear 
relationship  between  the  vector  of  observation  associated  with  the  scattering  experiment, 
Ui,  and  a  discrete  representation  of  the  conductivity  anomaly,  g.  Thus,  the  observation  model 
used  here  is  of  the  form 

Vi  =  Tig  +  Ui  (1) 

where  Ti  is  a  matrix  encompassing  the  (linearized)  physics  and  rii  is  an  additive,  zero-mean, 

uncorrelated,  random  vector  representing  the  noise  in  the  data.  That  is,  the  noise  is 
modeled  as  rii  ~  (0,  r*/)  where  I  is  an  appropriately  sized  identity  matrix^.  It  is  often  useful 

to  work  with  the  “stacked”  system  of  data 

y  =  Tg  +  n  (2) 

where  y  contains  the  information  from  all  sensors,  i  =  1,  2,  . . . ,  K.  Thus,  we  have  y  = 

[vT  Vi  ■■  ■  VkY  1  T  and  n  defined  accordingly. 

2.2  Regularized  Inversion  and  Its  Probabilistic  Interpretation 

A  commonly  used  technique  [10,15,17]  for  solving  linear  inverse  problems  of  the  form  in  (2) 
is  to  choose  the  estimate  of  g  according  to 

g  =  argmin  \\y  -  T g\W-r  +  \\Lgf 

9 

=  argmin  11?/- Tp||^-i  +  (3) 

9 


^The  notation  x  ~  {m,P)  indicates  that  the  random  vector  x  has  mean  m  and  covariance  matrix  P. 
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where  \\x\\%[  =  x^Mx.  The  first  term  in  (3)  enforces  fidelity  to  the  data  where  the  weighting 
reflects  the  relative  quality  of  each  of  the  measurement  sets  as  measured  by  the  inverse 
of  the  noise  covariance  while  the  second  term  can  alternatively  be  viewed  as  a  regularization 
term  (in  case  (2)  is  ill-conditioned  or  underdetermined)  or  as  a  prior  statistical  model  for 
g.  In  particular,  as  discussed  in  [18],  this  penalty  term  is  equivalent  to  a  prior  model  of  the 
form^ 

Lg  =  w  w~(0,/).  (4) 

Thus,  the  nature  of  the  regularization  or  the  prior  knowledge  is  captured  in  the  structure 
of  the  matrix  L.  Frequently  in  regularization  problems,  this  matrix  is  chosen  so  that  some 
degree  of  smoothness  is  present  in  g  in  which  case  L  —  XLq  where  Lq  is  a  discrete  form  of  an 
appropriate  differential  operator  [18].  The  scalar  factor  A  is  then  used  to  provide  a  tradeoff 
between  the  influence  each  of  the  two  terms  in  (3)  exerts  on  the  reconstruction. 

The  optimization  problem  given  by  (3)  admits  a  solution  which  defines  g  in  terms  of  the 
normal  equations 

(T^n-^T  +  L^L)g  =  T^n-^y.  (5) 

As  discussed  in  [18],  this  solution,  g  can  also  be  interpreted  as  the  linear  least  squares 
estimate  (ELSE)  or  minimum  variance  estimate  of  g  given  the  noisy  measurements  in  (2) 
with  n  ^  (0, 71)  and  given  the  prior  statistics  for  g  implied  by  (4),  i.e.  g  is  zero  mean  and  has 
L^L  as  the  inverse  of  its  covariance.  Furthermore,  the  estimation  error  covariance  matrix, 
that  is,  the  covariance  of  p  —  ^  is  given  by 

E  [(»  ~  SXS  -  9)^  =  (6) 

^Note  that  we  assume  zero-mean  in  the  prior  model  for  g  only  for  notational  simplicity.  There  is  no 
complexity  added  if  we  incorporate  a  prior  mean,  e.g.  in  a  penalty  term  of  the  form  \\L{g  -  po)!!^ 
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2.3  A  Multiscale  Representation  of  the  Problem 

For  the  inverse  problems  of  interest  here,  the  in  (1)  represent  discretized  integral  operators 
corresponding  to  different  scattering  experiments.  Furthermore  if  L  is  a  discrete  version  of 
a  differential  operator,  then  obtaining  g  from  (5)  requires  the  solution  of  an  elliptic  partial 
integro-differential  equation,  or  in  discretized  form,  a  dense  set  of  linear  equations.  More¬ 
over,  computing  the  error-covariance  matrix  in  (5)  corresponds  to  the  explicit  calculation  of 
the  inverse  of  the  large,  dense  (integro-differential)  matrix  T^TZ~^T  -f  iF L,  an  even  more 
formidable  task.  Among  the  objectives  in  introducing  wavelets  and  wavelet-based  prior  mod¬ 
els  are  to  make  the  computations  in  both  (5)  and  (6)  far  simpler  to  perform  and  to  facilitate 
a  deeper  look  at  data  fusion,  reconstruction  accuracy,  and  resolution. 

In  Section  4.1  we  define  orthonormal,  discrete  wavelet  transform  operators  Wi  and  Wg 
which  transform  the  measurements,  yi  and  the  discretized  conductivity  field,  g,  into  their 
respective  wavelet  decompositions 

rii  =  WiVi 
7  =  Wg5. 

In  our  analysis  of  (1),  we  use  Wi  and  Wg  to  move  from  physical  to  scale  space  in  the  following 
manner 

Vi  =  ^iy^  =  {mTiW;){Wgg)  +  W^ni 

=  Oij  -f  r-i  (7) 

where  W*Wg  =  I  follows  from  the  orthonormality  of  the  wavelet  transformation. 

Finally,  analogously  to  the  physical  space  case,  we  define  the  stacked  systems 

7]  =  ©7  -f  u  (8) 

with  Tj  =  [rjf  V2  ■  ■  ■  VkY  ®  ^  defined  accordingly. 
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In  analogy  to  the  discussion  in  Section  2.2,  we  can  now  define  a  corresponding  regularized 
inverse  problem,  or  equivalently,  a  linear-least-squares  estimation  problem,  in  the  wavelet 
transform  domain.  Specifically,  we  wish  to  reconstruct  7  based  on  a  prior  model  for  7, 
namely  7  ~  (0,Po),  together  with  the  noisy  measurements  (8).  That  is,  the  ELSE,  7,  is 
defined  as 

7  =  arg min  ||?7- 0711^-1  +  ||7l|^_i  (9) 

■y  0 

so  that  7  satisfies  normal  equations  of  the  form 

{e^R-^e  +  Po-^)7  =  (lo) 

and  the  corresponding  error-covariance  matrix  is  given  by 

P  =  E[(7  -  7)^(7  -  7)] 

=  (e^p-^e  +  Po-^)-^  (11) 

Comparing  (3),  (5),  and  (6)  to  eqs.  (9),  (10),  and  (11),  we  see  that  the  wavelet  transfor¬ 
mation  has  left  us  with  a  formulation  of  exactly  the  same  structure  as  we  had  originally.  The 
advantages  of  this  transformation  come  from  two  important  facts.  First,  the  measurements 
operator  ©  in  the  wavelet  domain  is  far  more  sparse  than  the  operator  T.  Secondly,  as  we 
will  see  in  Section  4.2,  the  inverse  of  the  prior  covariance  Pq”^  can  be  taken  to  be  diago¬ 
nal.  In  particular,  as  we  discuss  in  Section  4.2,  a  specific  diagonal  choice  for  implies 
a  smoothness  penalty  (or  equivalently  a  fractal  prior  model)  analogous  to  that  captured  by 
L'^L  in  (3),  (5),  and  (6)  when  L  is  a  differential  operator  while  other  choices  for  the  diagonal 
matrix  Pg”^  allow  us  additional  flexibility  to  capture  a  rich  variety  of  other  regularization 
objectives  or  prior  models.  As  a  consequence  of  the  sparsity  of  0  and  the  diagonal  nature  of 
Po“\  not  only  can  (10)  be  solved  efficiently,  but  also  the  the  matrix  to  be  inverted  in  (11)  is 
sparse  with  diagonal  regularization  provided  by  thereby  facilitating  the  computation  of 
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the  inverse  and  in  particular  the  diagonal  elements  of  P  corresponding  to  the  error  variances 
in  each  of  the  wavelet  coefficients  in  out  multiscale  representation  of  g. 

3  Formulation  of  the  inverse  conductivity  problem 

In  general,  problems  of  the  type  illustrated  in  Figure  1  are  characterized  by  a  nonlinear 
relationship  between  the  data  and  the  conductivity  perturbation  [24],  As  a  first  step  in 
exploring  the  utility  of  multiscale  and  statistical  methods  for  the  solution  of  inverse  problems, 
we  consider  in  this  paper  a  linearized  form  of  the  problem  obtained  using  the  so-called  first 
Born  approximation  to  Maxwell’s  equations  [14].  In  general,  the  Born  approximation  is 
valid  only  for  conductivity  perturbations  which  are  “small”  both  in  spatial  extent  as  well 
as  magnitude  relative  to  the  background  [14].  For  the  issues  of  interest  in  this  paper,  such 
assumptions  are  not  restrictive  and  in  Section  6  we  discuss  the  manner  in  which  our  approach 
can  be  extended  easily  to  other  linearizations  as  well  as  to  the  consideration  of  the  full, 
nonlinear  inverse  problem. 

Under  the  geometric  configuration  shown  in  Figure  1  and  discussed  in  Section  2,  all 
field  quantities  and  material  parameters  are  functions  only  of  the  two  space  coordinates, 
r  =  (x,z').  As  discussed  in  [12],  for  the  data  set,  the  model  linking  the  data  to  to  the 
conductivity  under  the  Born  approximation  takes  the  form  of  a  first-kind  Fredholm  integral 
equation 

yi(rj)  =  f  Ti{rj,r')g{r')dr'  +  ni{rj)  (12) 

J  c 

where  yi{rj)  is  the  observation  at  the  point  in  the  receiver  array  associated  with  the 
experiment,  Tj(r,  r')  is  derived  from  the  constant-background  Green’s  function  for  this 
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problem  and  is  dependent  upon  the  value  of  the  background  conductivity  as  well  as  the 
source/receiver  geometry  for  this  scattering  experiments,  ^(r)  is  the  conductivity  perturba¬ 
tion,  C  is  the  region  over  which  g  is  nonzero,  and  ni{rj)  represents  the  noise  corrupting  the 
data. 

Reduction  of  the  integral  equation  (12)  to  matrix-vector  relationships  require  represen¬ 
tation  of  g  in  terms  a  a  finite  number  of  parameters.  Here,  we  use  a  method-of-moments 
approach  [16]  in  which  we  first  pixelate  C,  and  then  assume  g  to  be  constant  over  each  of 
the  Ng  subregions.  Thus  we  have: 

9{r)  =  i  9{k)Xc,{v)  (13) 

fc=i 

with  XCfc(r)  fbe  characteristic  function  of  the  set  Ck,  C  =  Ui^l^Ck,  and  g{k)  the  value  of  g 
over  Ck^.  Substitution  of  (13)  into  (12)  yields 

^9 

^)9{k)  +  ni{rj) 

k=l 

with 

Ti{vj,k)  =  [  Ti{rj,r')dr'. 

JCk 

Denoting  yi{vj)  as  simply  yi{j)  with  analogous  notation  for  n,(rj)  and  Tj(rj,i)  yields  the 
discrete  form  of  (12) 

Ng 

ViU)  =  £  Ti{j,  k)g{k)  +  ni{j)  (14) 

k=i 

which  we  shall  write  simply  as  the  matrix-vector  product  in  (1). 

^Note  that  for  ID  problems  (see  Section  5.1)  the  Ck  will  represent  intervals  of  the  real  line  while  for  in 
two- dimensions  (Section  5.2),  Ck  will  be  rectangular  regions  in  the  plane. 
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4  A  Multiscale  Formulation  of  the  Inverse  Problem 

4.1  A  Wavelet  Representation  of  g  and  y 

The  fundamental  idea  behind  the  discrete  wavelet  transform  (DWT)  is  to  decompose  signal, 
here  represented  as  a  vector,  into  a  sequence  of  increasingly  “coarser”  representations  while 
at  the  same  time  retaining  the  information  lost  in  moving  from  a  fine  to  a  coarse  scale. 
In  our  case,  we  will  be  concerned  both  with  one  and  two  dimensional  signals  where  for 
simplicity,  we  first  describe  the  wavelet  representation  and  notation  for  a  ID  signal  vector, 
a.  Following  the  wavelet  literature,  the  elements  of  this  vector  are  termed  the  finest  scale 
scaling  coefficients  associate  with  o,  and  the  vector  a  is  denoted  by  a{Ma)  indicating  that 
these  scaling  coefficients  represent  a  at  scale  M^.  The  scale  number  reflects  the  number  of 
elements  in  a.  Typically,  we  consider  vectors  of  length  2^  for  which  the  scale  number  is  the 
integer  m. 

Beginning  with  a(Ma),  a  coarser  representation  (that  is,  a  coarser  set  of  scaling  coef¬ 
ficients),  a{Ma  —  1),  is  obtained  by  first  passing  a{Ma)  through  a  low  pass,  finite  impulse 
response  (FIR)  filter,  I,  and  then  decimating  the  filtered  output  by  a  factor  of  two.  Thus, 
a{Ma  —  l)  is  “coarser”  than  a{Ma)  in  that  the  filtering/downsampling  procedure  has  removed 
the  high  frequency  structure  from  the  original  signal,  and  a{Ma  —  1)  is  half  as  long  as  a{Ma). 
The  detail  lost  in  moving  from  a(Ma)  to  a{Ma  -  1)  is  extracted  separately  by  first  high  pass 
filtering  a{Ma)  with  the  FIR  filter  h  and  then  downsampling  by  two.  This  detail  vector  is 
denoted  a{Ma  —  !)•  The  filtering  and  decimation  procedure  is  successively  applied  to  the 
coarsened  versions  of  a  resulting  in  a  sequence  of  scaling  coefficient  vectors,  a(m),  and  a 
sequence  of  detail  vectors,  a{m),  for  m  =  Ma  —  1, . . . ,  La  where  La  is  the  coarsest  scale  at 
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which  a  is  represented.  Thus,  at  scale  m,  we  have 

a(m)  =  L{m)a{m  +  1)  (15) 

a{m)  —  H{m)a{rn-\-l)  (16) 

where  L{m)  and  H{m)  are  operators  (i.e.  matrices)  representing  the  filtering  and  decimation 

operations  with  the  filters  I  and  h  respectively. 

As  described  extensively  in  [4],  the  filters  I  and  h  are  constructed  so  that  H{m)  and  L{m) 

satisfy  the  so-called  perfect-reconstruction  properties 

L{m)L*{m)  =  I  H{m)H*{m)  =  I 
L*{m)L{m)  +  H*{m)H{m)  =  I  (17) 

where  H*{m)  is  the  adjoint  of  H{m).  Using  (15),  (16)  and  (17),  we  see  that  a{m  -I-  1)  is 

obtained  from  its  coarse  scale  representation  and  the  detail  at  scale  m  via 

a(m -T  1)  =  L*(m)a(m)  d- iT*(m)Q;(m).  (18) 

Clearly,  iterating  (18)  provides  the  mechanism  for  obtaining  the  scaling  coefficients  of  a  at 
scale  m  for  La  <  m  <  Ma  using  the  coarsest  scale  scaling  coefficients  a(La)  and  intervening 
detail  coefficients  a{n)  for  La  <  n  <  m  —  1.  In  particular,  we  may  construct  an  operator,^ 
Wfl  from  H{m)  and  L{m)  which  relates  the  finest  scale  scaling  coefficients,  a  =  a{Ma),  to 
the  coarsest  scaling  coefficients,  a(La),  and  the  full  set  of  detail  coefficients  a{m)  for  scales 

m  =  La,  La  +  1,  • . . ,  Ma.  That  is,  we  may  write 

a  =  Waa  (19) 

where  a  =  [^(Ma- 1)^  . . .  a{La)'^  a(La)^]^  and  Wa  satisfies  W*Wa  =  I-  We  call  the  vector 

01  the  wavelet  transform  of  a.  The  element  of  a{m)  is  denoted  a(m,  n)  and  is  referred  to 
as  the  shift  of  a  at  scale  m.  Similarly,  a(m,  n)  represents  the  element  of  the  vector 

^We  choose  to  subscript  the  wavelet  transform  operator  here  as  Wa  to  make  explicit  that  this  is  the 
transform  for  a.  We  may  (and  in  fact  will)  use  different  wavelet  transforms  for  the  different  variables. 
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of  scaling  coefficients  at  scale  m. 

Given  this  implementation  of  the  DWT,  the  relationships  among  the  scale  space  compo¬ 
nent  in  the  decomposition  of  a  are  graphically  represented  in  the  form  of  a  lattice  as  shown 
in  Figure  3  for  the  case  of  a  wavelet  decomposition  with  l{n)  and  h{n)  of  length  4  (such  as 
the  so-called  “D4”  or  Daubechies  4-tap  wavelet  decomposition  described  in  [4].)  The  coeffi¬ 
cients  at  any  scale  all  lie  on  a  common  horizontal  line  with  the  finest  scale  coefficients  at  the 
bottom  of  the  lattice  and  the  coarsest  at  the  top.  At  the  finest  scale,  the  nodes  represent 
the  finest  set  of  scaling  coefficients.  Each  node  at  all  other  scales  contains  one  wavelet  co¬ 
efficient  except  for  the  top  scale  where  the  nodes  contain  the  coarsest  wavelet  and  coarsest 
scaling  coefiicients.  Finally,  two  nodes  are  connected  by  an  arc  if  and  only  if  there  is  a  linear 
relationship  between  the  contents  of  these  nodes  as  dictated  by  the  structure  of  the  wavelet 
transform  matrix  Wq. 

For  future  reference  we  define  some  terminology  related  to  the  lattice  structure  in  Figure 
3.  In  particular,  a  coarse  scale  node  is  said  to  impact  a  finer  scale  if  there  exists  a  strictly 
downward  path  on  the  lattice  from  the  former  to  the  latter.  Thus,  we  define  the  downward 
impact  set.  Vim,  n)  associated  with  node  (m,  n),  as  the  set  of  finest  scale  nodes  which  (m,  n) 
impacts.  Thus  in  Figure  3,  V{D)  is  comprised  of  all  nodes  marked  with  the  symbol 

The  wavelet  decomposition  of  the  scaling  coefficients  of  a  two  dimension  function  is 
obtained  by  considering  a  as  a  matrix  and  applying  one  wavelet  transform,  Wa,z,  to  the 
columns  and  a  second  wavelet  transform,  Wa,x,  to  the  rows.  In  general,  Wa,z  and  yVa,x 
may  each  use  different  I  and  h  filters  and  associated  with  each  transform  are  a  finest  and 
a  coarsest  scale  of  interest  which  we  denote  and  for  /?  G  {x,z}.  We  use  Wa  to 
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represent  the  composition  of  the  operators  yVa,x  and  'Wa,z  and  write 

a  =  Waa^ 

Furthermore,  we  note 


(w:wja  =  =  (/)a(/)  =  a 

SO  that  yV*Wa  —  I-  As  in  the  ID  case,  we  denote  a  particular  element  of  a  by  a{m,n). 

Here,  we  understand  m  and  n  to  be  two-vectors  indexing  the  scales  and  shifts  in  the  x  and 
2;  directions,  i.e.  m  =  [m^  and  n  —  [rix  respectively.  Also,  we  use  the  notation 
a{m)  to  indicate  the  collection  of  wavelet  coefficients  at  all  shifts  and  at  scale  m  =  [rrixiriz]'^. 
Unlike  the  ID  case,  the  two-dimensional  wavelet  transform  induces  a  four  dimensional  lattice 
structure  in  scale  space  with  two  dimensions  for  scale  and  two  for  shift.  Nonetheless,  we 
define  downward  impact  sets  in  the  same  manner  as  was  the  case  in  ID.  That  is,  V{m,  n)  is 
the  set  of  nodes  in  a  which  are  impacted  by  Q;(m,  n). 

The  operators  Wg  and  Wj  introduced  in  Section  2.3  are  then  constructed  using  the 
procedures  just  described.  In  general,  these  operators  may  differ  from  one  another.  For 
example,  in  the  problems  considered  here,  all  of  the  yi  are  ID  signals,  while  g  may  either  be 
a  one  or  a  two  dimensional  function.  Furthermore,  in  principle,  we  can  use  different  wavelets 
for  different  data  sets  and  can  use  a  different  number  of  scales  in  the  wavelet  decomposition 
of  each  quantity  (see  the  particular  choices  used  in  Section  5). 


4.2  Multiscale  Prior  Models 

As  discussed  in  Sections  1  and  2,  a  key  component  in  our  formulation  of  the  inverse  problem 
is  the  use  of  a  multiscale  stochastic  model  for  g  to  regularize  the  inversion  and  to  capture 
prior  information.  To  motivate  the  particular  choice  of  prior  model  used  here,  consider  the 
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case  of  a  one  dimensional  function  whose  covariance  matrix  is  with  L  representing 

first  order  differentiation.  This  implies  that  is  a  Brownian  motion  satisfying  Lg  —  w 
with  w  ~  (0,/).  As  discussed  in  [18],  work  by  Wornell  and  others  [8,22]  has  demonstrated 
that  Brownian  motions  and  other  related  fractal  processes  can  be  closely  approximated  via 
a  statistical  model  in  which  the  wavelet  coefficients  of  g  are  independent  and  distributed 
according  to 

7(m,n)  ~  (0,  (20) 

Here,  controls  the  overall  magnitude  of  the  process  and  the  parameter  g,  determines  the 
fractal  structure  of  sample  paths.  The  case  /r  =  0,  corresponds  to  g  being  white  noise  while 
as  g  increases,  the  sample  paths  of  g  show  greater  long  range  correlation  and  smoothness. 

In  addition  to  defining  the  scale- varying  probabilistic  structure  of  the  wavelet  coefficients 
of  g^  we  also  must  provide  a  statistical  model  for  the  coarsest  scale  scaling  coefficients. 
Roughly  speaking,  these  coarse  scale  coefficients  describe  the  DC  and  low  spatial  frequency 
structure  oi  g.  In  the  applications  we  consider  here,  we  assume  that  we  have  little  a  priori 
knowledge  concerning  the  large-scale  average  value  of  g.  Consequently,  we  take  g{Lg)  ~ 
{biPhgl)  where  pLg  is  some  sufficiently  large  number.  By  choosing  pi^  in  this  manner,  we 
avoid  any  bias  in  the  estimator  of  the  low  frequency  structure  of  g.  Finally,  we  note  that  for 
these  models,  the  resulting  matrix  Pq  in  (10)  is  diagonal  with  nonzero  entries  corresponding 
to  the  variances  associated  with  each  component  of  the  7. 

For  the  case  where  p  is  a  two  dimensional  function,  we  consider  the  separable  represen¬ 
tation  with 

7(m,n)  -  (0, 

for  Lg^x  <  <  ^g,x  ~  1  and  Lg^^  <  m  <  —  1-  For  m  =  Lg^x  we  take  'y{m,n) 
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with  analogous  results  holding  when  n  =  Lg^^. 

Clearly,  other  choices  of  statistics  for  the  components  of  7  may  be  appropriate  in  specific 
applications,  and  our  methodology  can  readily  accommodate  these.  The  specific  choice 
we  have  made,  leading  to  a  l//-like  fractal  model,  is  useful  both  in  its  ability  to  model 
natural  phenomena  and  because  the  successively  decreasing  variances  of  the  fine  scale  wavelet 
coefficients  control  the  incorporation  of  high  frequency  information  into  the  reconstruction. 
As  will  be, seen  in  Section  5,  this  is  precisely  the  type  of  regularization  required  for  the  inverse 
conductivity  problem.  Thus,  the  fractal  model  provides  us  with  one  physically  meaningful 
way  in  which  to  specify  the  tradeoff  which  in  turn  determines  the  way  in  which  the  resulting 
estimation  algorithm  makes  effective  use  of  the  data  only  over  those  scales  where  useful 
information  is  present. 

4.3  The  Relative  Error  Covariance  Matrix 

A  key  advantage  of  the  use  of  statistical  estimation  techniques  is  the  ability  to  produce 
not  only  the  estimate  of  7  but  also  an  indication  as  to  the  quality  of  this  reconstruction 
in  the  form  of  the  error-covariance  matrix  P  defined  in  (11)  for  the  problem  of  interest  in 
this  paper.  While  the  information  contained  in  P  is  certainly  important  for  evaluating  the 
absolute  level  of  uncertainty  associated  with  the  estimator,  in  many  cases,  it  is  more  useful 
to  understand  how  data  serves  to  reduce  uncertainty  relative  to  some  reference  level.  That 
is,  we  have  some  prior  level  of  confidence  in  our  knowledge  of  7  and  we  seek  to  comprehend 
how  the  inclusion  of  additional  data  in  our  estimate  of  7  alters  our  uncertainty  relative  to 
this  already  established  level.  In  this  section  we  define  the  relative  error  covariance  matrix 
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(RECM)  and  demonstrate  its  utility  as  a  tool  for  capturing  such  changes  in  uncertainty.  The 
analysis  of  the  RECM  in  the  wavelet  domain  is  especially  interesting  because  it  allows  for  a 
localized  characterization  of  the  manner  in  which  data  impacts  a  reconstruction.  Hence,  we 
show  how  the  RECM  provides  a  natural  means  of  evaluating  the  appropriate  level  of  detail 
as  a  function  of  position  which  can  be  supported  in  a  reconstruction  based  upon  a  given  set 
of  data  and  also  leads  directly  to  a  quantitative,  multiscale  theory  of  sensor  fusion. 

The  definition  of  the  relative  covariance  matrix  is  motivated  by  the  definition  of  the 
relative  difference  between  two  scalars  a  and  b  given  by 

1  -  t.  (21) 

a 

The  matrix  analog  to  (21)  to  be  considered  in  this  paper  is  as  follows.  Let  {1,  . . . ,  K}  denote 
the  index  set  for  the  observations  sets  yi.  For  any  subset  M  of  {1,  . . . ,  RT}  let  denote 
the  estimation  error  covariance  as  in  (11)  resulting  from  the  estimation  of  7  based  upon  the 

data  sets  corresponding  to  A  (i.e.  {yi\i  e  A}  where  for  A  =  0,  the  empty  set,  P{0}  =  Pq, 

the  prior  covariance.  The  RECM  provides  a  measure  of  the  relative  quality  of  the  estimate 
based  upon  data  in  two  sets  A  and  B  and  is  given  by 

n(A,  P)  =  /  -  Pf'^PePA^^^-  (22) 

The  definition  of  n(24,  B)  in  (22)  possesses  many  useful  properties.  First,  like  an  error 
covariance  matrix,  it  is  symmetric.  Also  if  n(A,  B)  represents  the  relative  error  covariance 
matrix  for  the  estimation  of  p,  i.e.  the  physical-space  representation  of  the  conductivity, 

then  this  is  directly  computable  from  n(A,  P)  using  the  wavelet  transform 

n(A,P)  =  >V^n(A,P)W. 

Moreover,  it  is  not  difficult  to  show  that  n(A,  P)  is  normalized  to  the  extent  that  for  A  C  B, 

0  <  n(A,P)  <  J. 
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We  note  that  in  this  case  n(A,  R)  =  0  iff  Pjg  =  Pa  which  indicates  no  additional  reduction 
in  uncertainty  results  from  augmenting  A  with  the  data  sets  in  in  P  —  A.  Also,  n(A,  B)  =  I 
if  and  only  if  Pg  =  0  i.e.  only  when  all  uncertainty  in  7  has  been  removed  when  we  use  the 
additional  information  in  B  relative  to  A. 

In  the  event  P4  is  diagonal,  the  diagonal  components  of  n(A,  B)  are  particularly  easy  to 
interpret.  Let  erf  (A)  be  the  error- variance  of  the  component  of  7  arising  from  an  estimate 
based  upon  data  from  set  A.  Then,  the  component  of  the  diagonal  of  n(A,  B)  is  just 

1  -  ^KB)/crKA)  (23) 

which  is  nothing  more  than  the  relative  size  difference  of  the  error-variance  in  the  com¬ 
ponent  of  7  based  upon  data  from  sets  A  and  B.  Note  that  the  diagonal  condition  of  Pa  is 
met  in  this  paper  when  Pa  =  Po,  since  the  wavelet  and  scaling  coefficients  are  uncorrelated 
for  the  fractal  1//  priors  used  here  as  well  as  for  many  other  physically  meaningful  prior 
models.  Thus,  the  diagonal  elements  of  n({0},P)  represent  the  decrease  in  uncertainty  due 
to  the  data  from  set  B  relative  to  the  prior  model.  Finally,  as  n({0},P)  will  be  of  interest 
frequently  in  the  remainder  of  this  work,  we  shall  abuse  notation  and  write  n({0},  B)  as 
n(P)  in  cases  when  there  will  be  be  no  confusion. 

The  quantity  n(A,  B)  represents  a  useful  tool  for  quantitatively  analyzing  the  relationship 
between  the  characteristics  of  the  data  (as  defined  by  0  and  R)  and  the  structure  of  the 
estimate  7.  Consider,  for  example,  the  case  in  which  we  wish  to  assess  the  overall  value  of  a 
collection  of  observation  vectors.  Let  n™(P)  denote  the  diagonal  element  of  the  matrix  n(P) 
corresponding  to  the  wavelet  coefficient  at  scale/shift  (m,  n)^.  Because  Pq  is  diagonal,  (23) 

®At  scale  m  =  Lj,  we  are  interested  in  both  the  wavelet  and  scaling  coefficients  of  g.  To  avoid  ambiguity, 

we  use  the  notation  Iln"  to  refer  to  the  RECM  information  for  the  coarsest  scaling  coefficient  of  g  at  shift  n. 
In  the  case  of  a  two  dimensional  g,  where  there  is  confusion,  we  shall  explicitly  write  m  =  placing 
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indicates  that  represents  the  relative  decrease  in  the  error  variance  associated  with 

the  component  in  the  wavelet  transform  of  g  at  scale/shift  {m,n).  Thus,  provide  us 

with  a  natural  way  in  which  to  define  the  appropriate  level  of  detail  which  should 

be  included  in  a  reconstruction  of  g{Mg)  at  shift  j.  Specifically,  for  each  location  j,  we 
can  examine  the  quality  of  the  information  present  at  this  point  and  at  all  coarser  scale 
“ancestors”  of  j.  Using  the  terminology  introduced  in  Section  4.1,  we  say  that  the  data 
supports  a  reconstruction  of  g{Mg,j)  at  scale  m  if  there  exists  some  node  in  the  wavelet 
lattice  of  g  at  scale  m  which  (1)  impacts  g{Mg,j)  (i.e.  for  some  shift  n,  g{Mg,j)  G  V{m,n)) 
so  that  (m,  n)  is  an  ancestor  of  {Mg,j)  and  (2)  for  which  the  data  provides  a  sufficiently  large 
quantity  of  information  regarding  the  structure  of  g  at  node  {m,n)  (i.e.  n™(R)  is  in  some 
sense  large).  Clearly,  m*(j)  is  the  finest  scale  for  which  a  node  {m,n)  may  be  found  that 
satisfies  the  above  two  criteria.  The  precise  quantification  of  “sufficiently  large”  will  depend 
upon  the  particular  application  and  on  the  structure  of  the  particular  inverse  problems  under 
investigation. 

For  the  problems  considered  here,  the  diagonal  structure  of  Pq  imply  that  0  <  Tl^{B)  < 
1  so  that  determining  whether  11™  (J5)  is  sufficiently  large  is  accomplished  by  comparing 
this  quantity  to  some  threshold,  r,  between  zero  and  one.  The  procedure  described  in  the 
preceding  paragraph  for  determining  the  optimal  scale  of  reconstruction  suggests  that  the 
only  elements  of  7  which  need  be  estimated  are  those  for  which  n™(S)  >  r.  Hence,  we  are 
a  bar  over  that  scale  index  (those  scale  indices)  associated  with  coarsest  scaling  coefficient. 


Submitted  to  IEEE  Trans.  Geoscience  and  Remote  Sensing 


23 


led  to  define 


a  truncated  version  of  7,  as  follows: 

0  U^(B)  <  T 


[TT](m,n)  —  A 


(24) 


[[7](m,n)  Otherwise 

where  [7](m,n)  is  the  component  in  the  vector  7  at  scale  m  and  shift  n.  Defining  7^  in  this 
way  ensures  that  Qr  =  W^7r  is  in  fact  the  reconstruction  of  g  which  at  each  shift  n  contains 
detail  information  at  scales  no  finer  than  m*{n). 

In  addition  to  its  use  in  assessing  the  scale  of  reconstruction  supported  by  the  information 
from  a  set  of  measurements,  if  we  consider  the  case  in  which  neither  A  nor  B  is  empty,  we 
find  that  there  are  several  ways  in  which  n(^,  B)  may  be  of  use  in  assessing  the  value  of 
fusing  information  from  multiple  data  sets  and  in  identifying  how  this  fusion  takes  place. 
For  example,  ii  A  C  B,  then  roughly  speaking,  if  n(A,  B)  is  significantly  larger  than  0, 
there  is  a  benefit  in  the  additional  information  provided  by  the  observations  in  B  —  A. 
Moreover,  n^(^,  B)  can  be  used  to  pinpoint  the  scales  and  locations  at  which  this  fusion 
has  significant  benefit®  i.e.,  those  scales  and  shifts  at  which  active  sensor  fusion  is  taking 
place.  Furthermore,  by  varying  the  sets  A  and  B,  we  can  identify  not  only  the  optimal 
scale  for  reconstruction  at  each  point  but  can  also  identify  which  data  sets  are  actively 
used  to  obtain  that  estimate.  That  is,  for  each  (m,  n),  we  can  in  principal  find  the  set 
A  C  {1,  ...,  K}  so  that  n^(M, {1,  ...,  K})  is  small  (indicating  that  sensors  not  in  A 
provide  little  additional  information  to  the  reconstruction  of  wavelet  coefficient  {m,n))  and 
so  that  for  any  C  C  A,  'n^{C,A)  is  of  significant  size  (so  that  all  of  the  sensors  actively 
contribute  to  the  reconstruction  at  this  scale  and  shift.) 

®In  this  case,  because  Pa  is  not  in  general  diagonal,  the  diagonal  elements  of  n(A,  B)  do  not  have  the 
exact  interpretation  as  the  relative  size  difference  of  the  error  variance  of  7  based  upon  data  from  A  and  B; 
however  the  size  of  these  diagonal  components  of  Ilfd.,  B)  still  provides  useful  insight  as  to  the  scales  and 
shifts  where  the  observations  from  set  B  provide  information  not  found  in  the  data  from  set  A. 
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5  Examples 

5.1  A  One  Dimensional,  Radial  Profiling  Problem 

We  begin  by  considering  a  radial  profiling  problem  similar  to  that  analyzed  by  Habashy  at. 

a/  in  [11,13].  Here,  g  is  assumed  to  vary  only  in  the  horizontal  direction  in  Figure  1  with 

the  specific  true  conductivity  profile  g  to  be  used  in  this  example  shown  as  the  solid  line  in 

Figure  5.  The  numerical  values  specifying  the  prior  model  and  the  parameters  describing 

the  background  medium  are  given  in  Table  2.  In  this  work,  the  signal  to  noise  ratio  of  the 

vector  Tji  —  0i7  +  Ui  with  Ui  ^  (0,  rfl)  and  7  ~  (0,  Pq)  is  defined  as 

SNR^  -  in0t7  _  tr(0^Po0f) 

*  Power  per  pixel  in  Vi 

where  Ng  is  the  length  of  the  vector  7  and  tr  is  the  trace  operation. 

The  objective  of  this  example  is  to  illustrate  the  utility  of  the  RECM  in  analyzing  the 
various  ways  in  which  the  data  available  to  a  reconstruction  impacts  the  estimate.  Specifi¬ 
cally,  we  explore  inversions  using  data  from  the  following  three  different  combinations  of  the 
high  and  middle  frequency  scattering  experiments  described  in  Table  1: 

Dhi  Data  collected  at  the  left  receiver  array  in  response  to  all  three  sources  operating  at 
the  the  highest  frequency  (i.e.  information  from  experiments  1-3  in  Table  1). 

Dmid  Data  collected  at  the  left  receiver  array  in  response  to  the  three  sources  operating  at 
the  middle  frequency  (i.e.  information  from  experiments  4-6  in  Table  1). 

Dhi, mid  Data  from  Dhi  U  Dmid- 


The  information  regarding  the  structure  of  g  supplied  by  Dhi  and  Dmid  is  illustrated  in 
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Figure  4.  Recall  from  (1)  that  the  observation  point  of  the  data  set, 

ViU)  =  ^)9{k)  +  ni{j).  (25) 

fc=i 

so  that  the  row  of  Ti  represents  the  map  which  takes  conductivity,  g,  into  the  element 
of  the  observation  vector.  In  Figure  4(a)  (resp.  4(b)),  a  single  row  from  kernels  associated 
with  high  (resp.  middle)  frequency  scattering  experiments  are  shown.  Specifically,  we  plot 
the  maps  associated  with  the  observation  point  in  the  middle  of  the  left  receiver  array  for 
experiments  whose  source  is  the  middle  of  the  three  line  sources.  Prom  these  illustrations, 
we  see  that  the  high  frequency  observations  are  most  sensitive  to  variations  in  g  close  to 
a;  —  0  but  provide  essentially  no  information  regarding  the  structure  of  g  far  from  the  origin. 
The  data  corresponding  to  middle-frequency  sources  better  refiect  the  behavior  of  g  away 
from  a;  =  0  but  still  are  comparatively  insensitive  to  the  conductivity  far  from  the  point  of 
observation.  Thus,  in  general  we  expect  to  obtain  a  relatively  accurate  reconstruction  of  g 
near  x  =  0  with  decreasing  fidelity  as  a  function  of  radial  position. 

In  Figure  5(a),  the  estimate  obtained  using  data  sets  1-12,  g{DHi,MiD)i  is  compared  with 
the  true  function.  Clearly,  we  are  able  to  resolve  the  left  edge  and  to  a  lesser  extent  the  mag¬ 
nitude  of  the  conductivity  anomaly  located  closest  to  the  origin.  However,  the  information 
provided  by  Dhi^mid  is  insufficient  to  obtain  an  accurate  estimate  of  the  right  edge  of  this 
structure  or  any  but  the  coarsest  information  regarding  the  rightmost  block.  As  a  means  of 
understanding  how  both  Dhi  and  Dmid  contribute  information  to  this  estimate,  in  Figures 
5(b)-(c),  g(DHi,MiD)  is  graphed  against  g{DHi)  and  ^{Dmid)  respectively.  Again,  we  see 
that  individually,  the  data  from  the  high  and  middle  frequency  sources  provide  information 
about  g  close  to  a;  =  0.  Further  from  the  origin,  g{DHi,MiD)  follows  neither  g{DHi)  nor 
9{Dmid)  so  that  some  level  of  data  fusion  must  be  taking  place  to  the  extent  that  the  pres- 
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ence  of  both  data  sets  together  yields  an  estimate  of  g  over  this  region  which  is  substantially 
different  from  that  obtained  from  either  set  alone. 

A  more  accurate  assessment  of  the  manner  in  which  this  information  is  merged  is  obtained 
by  analysis  of  the  diagonal  elements  of  the  relative  error  covariance  matrices,  n(S)  for 
B  G  {Dhi,  Dmidi  D hi, mid}-  In  Figure  6  these  quantities  are  plotted  for  scales  3,  4, 
and  6.  In  each  of  these  graphs,  n™(Diyj)  is  marked  with  a  o,  I[^{Dmid)  with  a  x,  and 
Yi^{DHi,MiD)  with  a  +.  As  there  is  strictly  more  information  in  Dhi,mid,  than  in  either 
Dhi  or  Dmid  alone,  it  is  the  case  that  all  +’s  must  lie  above  the  other  two  symbols.  In 
those  cases  where  'n^{DHi,MiD)  is  significantly  larger  than  both  and 

we  say  that  active  sensor  fusion  is  taking  place.  Indeed,  in  Figure  6(a),  this  is  the  case  for 
the  estimates  of  elements  5  -  8  of  g{Lg).  Moreover,  examination  of  Figures  6(b)-(d)  shows 
that  active  sensor  fusion  is  occurring  with  respect  to  the  estimates  of  the  wavelet  coefficients 
of  g  near  the  origin  at  scales  3,  4,  and  6.  We  have  omitted  the  RECM  plot  at  scale  5  as 
no  such  fusion  occurs  at  that  scale  in  this  example.  Finally,  Figure  6  is  instructive  to  the 
extent  that  it  demonstrates  where  the  data  do  not  support  a  reconstruction.  The  fact  that 
is  close  to  zero  at  all  scales  and  for  all  wavelet  coefficients  corresponding  to  shifts  far 
from  x  =  t)  indicates  that  the  information  in  Dhi  and  Dmid  either  alone  or  in  combination 
is  insufficient  to  reconstruct  any  detail  in  g  over  this  domain. 

This  notion  can  be  made  more  precise  by  considering  the  space-varying  optimal  scale 
of  reconstruction,  m*(j),  defined  in  Section  4.3.  In  Figure  7(a)  and  (b),  the  optimal  scale 
as  a  function  of  position  is  plotted  for  r  =  0.05  and  r  =  0.5  respectively  using  data  from 
Dhi, MID-  For  the  smaller  value  of  r,  we  see  that  as  the  x  grows  large,  the  optimal  scale 
drops  from  6  to  3  in  a  manner  quite  consistent  with  the  intuition  developed  by  examination 
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of  the  estimates.  That  is,  for  a  rather  narrow  region  near  the  origin,  the  RECM  information 
dictates  that  a  fine  scale  reconstruction  of  g  should  be  possible.  As  x  increases,  the  scale 
of  detail  to  be  included  in  g{DHi,MiD)  decreases.  For  r  =  0.50,  Figure  7(b)  shows  similar 
characteristics  to  the  r  =  0.05  case;  however,  the  more  stringent  threshold  results  in  a  more 
rapid  decrease  in  scale  as  a  function  of  distance.  Finally,  in  Figures  7(c)- (d)  the  truncated 
estimates,  gr{DHi,MiD),  defined  by  (24),  are  compared  against  g{DHi,MiD)  for  r  =  0.05  and 
r  =  0.50  respectively  showing  that  there  is  little  difference  between  the  optimal  estimate 
and  its  truncated  versions. 

The  relative  error  covariance  matrix  also  represents  a  useful  tool  for  analyzing  the  in¬ 
cremental  benefits  associated  with  the  addition  of  data  to  an  already-formed  estimate.  In 
Figure  8,  the  diagonal  elements  of  II{Dhi,  Dhi,mid)  are  displayed  for  the  coarsest  scaling  co¬ 
efficients  and  the  finest  wavelet  coefficients.  These  plots  illustrate  that  the  middle-frequency 
data  sets  contribute  new  information  to  an  estimate  based  upon  the  high-frequency  obser¬ 
vations  at  the  coarsest  scale  away  from  the  origin  and  at  the  finest  scale,  closest  to  the 
origin.  For  all  other  scales  and  shifts,  W^{Dhi,Dhi,mid)  is  essentially  zero.  We  note  that 
the  RECM  information  is  in  accord  with  the  plots  of  the  estimates  in  Figure  5  where  we  saw 
little  difference  in  the  actual  estimates  based  upon  the  different  data  sets  near  the  origin 
while  farther  from  x  =  0,  the  estimate  generate  from  both  Dhi  and  Dmid  differed  signifi¬ 
cantly  in  a  very  coarse  scale  manner  from  those  obtained  using  either  the  high  or  the  middle 
frequency  data. 

From  this  example,  we  see  that  the  relative  error  covariance  matrix  provides  new  and 
useful  insight  into  multisensor  data  fusion  not  obtainable  by  analysis  of  either  the  kernel 
functions  or  the  estimates.  The  use  of  the  RECM  is  significant  because  it  allows  for  the  for- 
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mulation  of  these  conclusions  before  any  data  are  obtained.  For  the  radial  profiling  problem 
considered  here,  one  would  conclude  that  the  data  from  the  high  and  middle  frequency  data 
sets  is  useful  for  the  recovery  of  the  conductivity  detail  structure  near  the  origin;  however, 
additional  observations  are  required  to  recover  all  but  the  coarsest  scale  information  far  from 
a:  =  0.  Additionally,  the  RECM  analysis  suggests  that  the  original  parameterization  of  g 
involving  128  degrees  of  freedom  is  clearly  excessive.  Rather,  at  a  threshold  of  r  =  0.50, 
the  data  dictates  that  only  9  elements  of  7  (the  nonzero  elements  of  %.5o{Dhi,mid))  can  be 
accurately  recovered  representing  a  93%  reduction  in  complexity  of  the  inverse  problem. 

5.2  A  Two-Dimensional,  Cross- Well  Tomography  Problem 

In  this  example  we  consider  improving  resolution  near  the  right  side  of  the  conductivity 
anomaly  by  using  observations  obtained  from  sources  located  at  the  left  and  receivers  on  the 
right  side  of  C.  This  observation  configuration  arises  quite  frequently  in  practice  especially 
in  the  fields  of  medical  imaging  and  geophysical  prospecting  [5,6,24]  and  we  term  inverse 
problems  with  this  measurement  geometry  cross-well  tomography  problems  as  they  model 
the  case  where  the  lines  x  =  b  and  x  —  100  are  taken  to  be  oil  boreholes  [24].  Additionally, 
we  now  assume  a  full  2D  problem  wherein  g  is  varies  both  in  the  x  and  the  2  directions.  The 
true  conductivity  anomaly  to  be  reconstructed  in  this  example  is  displayed  in  Figure  10(a) 
and  the  various  parameter  values  needed  for  this  experiment  are  given  on  Table  3. 

From  the  radial  profiling  problem,  to  obtain  information  regarding  the  behavior  of  g  far 
from  a:  =  0  it  is  necessary  to  probe  the  medium  with  low  frequency  energy.  Hence,  for  this 
problem,  we  augment  Dhi,mid  with  data  sets  9-12  from  Table  1.  These  data  are  generated 
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by  low  frequency  sources  located  near  the  left  side  of  the  region  of  interest  and  measured  by 
the  receiver  array  located  at  right  side.  We  denote  this  addition  collection  of  observations 
Dio-  The  structure  of  the  kernels  associated  with  this  problem  is  seen  in  Figure  9.  In 
particular  recalling  the  configuration  of  sources  and  receivers  in  Figure  1,  the  plots  in  Figure 
9  correspond  to  the  maps  taking  g  into  the  observation  at  the  mid-point  of  the  left  (in  (a) 
and  (b))  or  right  (in  (c))  receiver  array  in  response  to  input  energy  from  the  middle  source. 
Also,  as  5^  is  a  2D  function,  so  too  are  these  maps;  hence,  each  pixel  in  Figures  9(a)-(c) 
represents  the  weight  placed  on  the  corresponding  element  of  g  in  the  sum  (25)  with  darker 
colors  indicating  larger  magnitudes.  The  high  and  middle  frequency  scattering  experiments 
are  most  sensitive  to  variations  in  g  near  the  left  side  of  the  square.  The  structure  of  the  low 
frequency  kernel  with  areas  of  sensitivity  near  both  the  left  and  right  vertical  edges  suggests 
that  the  addition  of  data  from  Dio  will  improve  the  estimate  of  g  near  a:  =  0  and  allow  for 
the  determination  of  at  least  some  structure  at  the  far  side  of  the  region. 

In  Figure  10,  we  see  that  the  addition  of  the  low-frequency,  cross-well  data  does  signif¬ 
icantly  improve  the  resolution  on  the  right  side  of  A.  Figure  10(b)  (resp.  (c))  is  a  display 
of  g{DHiMiD)  (resp.  g{DHi,MiD,Lo))-  Given  only  the  high  and  medium  frequency  infor¬ 
mation,  the  anomaly  near  x  =  100  is  almost  completely  undetected;  however,  the  addition 
of  the  low  frequency  data  clearly  improves  the  ability  to  resolve  this  second  structure.  We 
do  note  that  while  both  conductivity  perturbations  are  reflected  in  the  estimates  of  g,  the 
nature  of  the  physics  of  the  problem  allows  for  only  a  comparatively  coarse-scale  or  blurred 
reconstruction  near  the  right  vertical  edge  of  the  anomaly.  In  general,  for  inverse  scat¬ 
tering  problems  of  the  type  considered  here,  one  requires  data  at  more  frequencies  and/or 
from  many  source/receiver  combinations  in  order  to  obtain  significantly  higher  resolution 
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estimates  of  such  anomalies. 

As  in  the  radial  profiling  problem,  the  relative  error  covariance  matrix  is  a  useful  tool 
in  understanding  this  sensor  fusion  problem.  In  the  cross-well  case  however  we  have  the 
additional  ability  to  analyze  the  detail  information  in  both  the  x  and  z  directions.  For  this 
experiment,  we  have  dense  observations  on  either  vertical  edge  and  a  rather  sparse  horizontal 
sampling.  Thus,  we  anticipate  that  our  ability  to  resolve  detail  in  these  two  directions  will 
be  significantly  different  and  this  difference  should  be  captured  via  the  RECM  analysis.  In 
Figure  11,  the  finest  scales  supported  in  the  reconstruction  in  both  the  x  and  z  directions 
are  plotted  as  a  function  of  position  for  r  =  0.50  for  the  two  cases  where  data  from  Dhi,mid 
and  Dhi,mid,lo  respectively  are  available  for  the  reconstruction.  From  Figure  ll(a)-(b)  we 
see  that  given  only  high  and  middle  frequency  information,  detail  in  the  reconstruction  is 
limited  to  the  region  near  x  =  0  in  both  x  and  2:  which  is  consistent  with  the  actual  estimate 
in  Figure  10(a).  Figure  ll(c)-(d)  shows  that  the  addition  of  the  low-frequency  measurements 
significantly  raises  the  level  of  detail  in  the  reconstruction  over  the  right  half  of  the  region 
of  interest  which  is  in  accord  with  the  intuition  provided  by  the  structure  of  the  kernel 
functions  associated  with  these  observations.  Specifically,  we  note  that  the  minimum  level 
of  2  oriented  detail  increases  from  2  in  Figure  11(a)  to  3  in  Figure  11(c).  Moreover,  the  finest 
scale  of  horizontal  detail  moves  from  1  to  2  in  the  area  near  the  right  vertical  edge. 

Finally,  go.^{DHi,MiD,Lo),  the  truncated  estimate  of  g  defined  in  (24),  is  plotted  in  Figure 
10(d).  In  this  case  go.^{DHi^MiD,Lo),  is  composed  of  only  75  nonzero  wavelet  coefficients  as 
opposed  to  the  256  in  the  original  corresponding  to  a  70%  reduction  in  inversion  complexity. 
Visual  comparison  of  this  reconstruction  with  the  full,  untruncated  estimate  indicates  that 
all  of  the  features  captured  in  the  optimal  estimate  are  in  fact  present  in  the  truncated 
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version  as  well. 


6  Conclusion  and  Future  Work 

In  this  paper,  we  have  presented  an  approach  to  the  solution  of  the  inverse  scattering  problem 
in  the  Born  approximation  based  upon  techniques  drawn  from  the  fields  of  multiscale  mod¬ 
eling,  wavelet  transforms,  and  statistical  estimation.  We  begin  with  a  system  of  noisy,  linear 
integral  equations  describing  the  relationship  between  the  scattered  fields  and  the  function 
to  be  estimated,  g.  After  discretization,  wavelet  methods  are  used  to  transform  the  problem 
from  real-space  to  scale-space.  A  linear  least  squares  estimator  serves  as  the  inversion  algo¬ 
rithm  and  produces  a  multiresolution  estimate  of  g,  i.e.  an  estimate  of  its  wavelet  transform 
7.  Regularization  is  achieved  via  a  statistical  model  of  7  which  also  provides  a  means  of 
capturing  any  available  prior  information  regarding  the  structure  of  g.  The  structure  of  this 
model  allows  us  considerable  flexibility  in  capturing  the  statistical  structure  of  g^  including 
the  incorporation  of  scale- varying  statistics.  To  illustrate  our  methods,  we  have  used  one  of 
many  possible  statistical  models,  namely  one  that  has  the  l//-like  fractal  structure  that  is 
often  posited  as  a  meaningful  model  for  natural  phenomena. 

Our  approach  makes  extensive  use  of  scale-space  in  the  analysis  of  linear  inverse  prob¬ 
lems.  The  relative  error  covariance  matrix  (RECM)  represents  a  quantitative  tool  for  under¬ 
standing  the  various  ways  in  which  data  from  a  multitude  of  sensors  contribute  to  the  final 
reconstruction  of  g.  We  demonstrate  a  method  for  determining  the  optimal  level  of  detail  to 
include  in  the  estimate  of  s'  as  a  function  of  spatial  location.  The  RECM  explicitly  provides 
a  means  of  capturing  the  way  in  which  this  level  is  affected  by  changing  the  information  used 
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in  the  inversion.  For  example,  the  incremental  benefits  associated  with  the  addition  of  data 
from  another  sensor  is  readily  explored  using  the  RECM.  Finally,  we  have  shown  the  use  of 
this  quantity  in  describing  the  process  of  multisensor  data  fusion  in  a  wavelet  setting. 

The  RECM  analysis  can  be  of  great  use  in  the  design  of  inversion  experiments.  Be¬ 
cause  the  RECM  is  not  a  function  of  the  data,  one  can  evaluate  and  therefore  alter  the 
experimental  configuration  prior  to  actually  collecting  data.  Moreover,  having  settled  on  the 
characteristics  of  the  data  sources,  the  RECM  can  be  used  to  understand  precisely  where  in 
a  parameterization  of  g  (i.e  for  which  degrees  of  freedom)  the  data  contributes  useful  and 
significant  information  thereby  leading  to  a  substantial  reduction  in  inversion  complexity. 

We  note  that  the  general  methodologies  presented  here  are  not  restricted  to  the  case  of 
a  Born  linearization  but  in  fact  should  prove  to  be  useful  in  the  analysis  of  other  forms  of 
the  inverse  scattering  problem  as  well.  Because  the  mathematical  description  of  the  problem 
in  the  less  restrictive  Rytov  linearization  is  virtually  identical  to  that  obtained  in  the  Born 
limit  [5,6],  our  multiscale  technique  should  find  application  in  a  variety  of  applications  where 
the  former  approach  is  most  appropriate.  Additionally,  in  recent  years,  a  variety  of  algorithms 
for  solving  the  full  nonlinear  inverse  scattering  problem  have  been  proposed  [13,24].  In  [13], 
it  is  necessary  to  expand  the  function  g  in  an  appropriate  basis  [13]  which  “is  influenced  by 
the  a  priori  information  available  on  the  unknown  profile.”  The  use  of  a  wavelet  basis  and 
fractal-type  of  regularizer  would  in  this  case  be  rather  natural  alternatives.  Moreover,  the 
algorithms  in  [13,24]  employ  variants  of  Newton’s  method  to  execute  the  nonlinear  inversion. 
As  discussed  in  [19],  at  each  step  of  the  algorithm  one  is  faced  with  solving  a  linear  systems 
which  essentially  has  the  same  form  as  the  least  squares  problem  considered  in  this  paper. 
Thus,  by  considering  a  multiscale  form  of  the  problem,  a  RECM  analysis  could  be  used  to 


Submitted  to  IEEE  Trans.  Geoscience  and  Remote  Sensing 


33 


determine  which  coefficients  in  the  wavelet  expansion  should  be  determined  at  each  iteration. 
Given  that  relatively  few  such  elements  were  recoverable  for  the  examples  considered  in  this 
paper,  the  use  of  a  multiscale  formalism  in  connection  with  these  nonlinear  problems  could 
provide  significant  computational  savings. 

Einally,  although  not  considered  extensively  in  this  work,  the  multiscale,  statistically 
based  inversion  algorithms  admit  highly  efficient  implementations.  As  discussed  by  Beylkin 
et.  al  in  [2],  wavelet  transforms  of  many  operator  matrices,  including  those  arising  in  the 
problem  studied  here,  contain  very  few  significant  elements  so  that  zeroing  the  remainder 
lead  to  sparse  matrices  0j.  The  sparsity  of  0^  combined  with  the  diagonal  structure  of  Pq 
imply  that  highly  efficient,  iterative  algorithms  such  as  LSQR  [21]  can  be  used  to  solve  the 
normal  equations.  Indeed,  in  [19],  we  consider  the  development  of  a  modified  form  of  LSQR 
designed  for  the  efficient  and  stable  computation  of  7  as  well  as  arbitrary  elements  in  the 


error  covariance  and  relative  error  covariance  matrices. 
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Figure  1:  Configuration  of  inverse  conductivity  problem.  The  electromagnetic  sources  (indi¬ 
cated  by  the  black  circles)  emit  time-harmonic  waves  into  a  lossy  medium  which  subsequently 
are  scattered  by  conductivity  inhomogeneities  located  in  the  darkly  shaded  rectangle,  C.  The 
secondary  fields  are  observed  at  one  or  both  receiver  arrays  located  on  either  vertical  edge 
of  region  under  investigation.  Based  upon  these  observations,  the  objective  of  the  inverse 
problem  is  the  reconstruction  of  the  conductivity  perturbation. 
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a(Ma  -1 )  “(Ma  -2 )  a(L  a ) 

Figure  2:  Wavelet  transform  pyramid. 


Submitted  to  IEEE  Trans.  Geoscience  and  Remote  Sensing 


39 


cx(M-2,n) 

/ 


=  Element  of  a  at  scale  M  and  shift  j 


Figure  3:  A  sample  lattice  strncture  corresponding  to  a  D4  wavelet  transform.  The  finest 
scale  is  taken  as  Ma  while  the  coarsest  is  La- 
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(a)  Typical  structure  associated  with  (b)  Typical  structure  associated  with 
high  frequency  kernel  middle  frequency  kernel 


Figure  4:  Typical  structure  of  kernel  functions  used  in  the  reconstruction  of  g  for  the  radial 
profiling  example. 
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(a)  g  (solid  line)  versus  g{DHi,MiD)  (b)  g{DHi,Mw)  (solid  line)  versus 
(dashed  line)  9{Dhi)  (dashed  line) 


(c)  g{DHi,MiD)  (solid  line)  versus 
^Dmid)  (dashed  line) 


Figure  5:  Estimates  of  g  using  various  combinations  of  high  and  middle  frequency  data.  We 
note  that  in  all  cases,  the  measurements  provide  sufficient  information  to  reconstruct  only 
those  features  of  g  near  x  —  t).  At  points  further  from  the  origin,  only  the  coarsest  scale 
characteristics  of  g  are  resolvable.  Moreover,  as  g{DHi,MiD)  is  significantly  different  from 
both  giDni)  and  g{DMiD)  we  conclude  that  some  type  of  sensor  fusion  is  occurring  over  the 
region  far  from  x  =  0. 
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Shift,  n 


Shift,  n 


(a)  RECM  information  for  coarsest  (b)  RECM  information  for  wavelet  co¬ 
scaling  coefficients  (i.e.  scale  3)  efficients  at  scale  3 


Shift,  n  Shift,  n 

(c)  RECM  information  for  wavelet  co-  (d)  RECM  information  for  wavelet  co¬ 
efficients  at  scale  4  efficients  at  scale  6 


Figure  6:  Diagonal  elements  of  relative  error  covariances  for  three  radial  profiling  experi¬ 
ments.  In  all  cases,  the  symbol  “-f”  corresponds  to  “o”  to  II{Dhi)  and  “x” 

to  n(£>M7D)-  From  (a)  we  see  a  significant  level  of  sensor  fusion  taking  place  with  respect 
to  the  estimates  of  the  coarsest  scale  scaling  coefficients  far  from  the  origin  x  =  0.  From 
(b)-(d),  we  conclude  that  accurate  reconstruction  of  the  detail  components  of  g  is  limited 
to  shifts  close  to  x  =  0. 
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Distance  from  borehole  Distance  from  borehole 


(a)  The  optimal  scale  of  reconstruction  (b)  The  optimal  scale  of  reconstruction 
as  a  function  of  position  at  scale  Mg  =  as  a  function  of  position  at  scale  Mg  = 

7  for  a  threshold  value  of  r  =  0.05.  7  for  a  threshold  value  of  r  =  0.50. 


Distance  from  borehole  Distance  from  borehole 

(c)  g  (solid  line)  vs  po.os  (dashed  line)  (d)  g  (solid  line)  vs  ^0,50  (dashed  line) 

Figure  7:  Maps  of  the  optimal  scale  of  reconstruction  and  the  associated  estimates  of  g  for 
threshold  values  r  e  {0.05,  0.50}.  These  illustrations  provide  a  quantitative  verification  of 
the  intuition  that  resolution  in  the  inversion  should  drop  as  a  function  of  distance  from  the 
origin.  In  (c)  and  (d),  the  plots  of  g  against  ^0.05  and  ^0.50  respectively  show  that  little  is 
lost  in  reducing  the  complexity  of  the  model  by  eliminating  degrees  of  freedom  about  which 
the  data  provides  little  or  no  information. 
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Figure  8:  The  incremental  reduction  in  uncertainty  obtained  by  adding  data  from  the  middle 
frequency  observation  to  an  estimate  based  upon  the  high  frequency  measurement  sources. 
In  accordance  with  Figure  6(a)  we  see  significant  benefits  associated  with  determination  of 
both  the  coarsest  scale  structure  of  g  far  from  the  origin  as  well  as  the  finest  scale  structure 
closest  to  a;  =  0. 
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(a)  Structure  of  typical  high  frequency 
kernel 


(b)  Structure  of  typical  middle  fre¬ 
quency  kernel 


0  20  40  60  80  100 

X 

(c)  Structure  of  typical  low  frequency 
kernel 


Figure  9:  Typical  structure  of  kernel  functions  used  in  the  reconstruction  of  g  for  the  cross¬ 
well  tomography  example.  Each  image  corresponds  to  map  taking  conductivity  to  the  mea¬ 
surement  obtained  at  the  center  of  the  left  (in  (a)  and  (b))  or  right  (in  (c))  receiver  array  in 
response  to  excitation  from  the  middle  source  with  darker  shades  indicating  larger  values. 
As  in  the  radial  profiling  example,  the  high  and  low  frequency  kernels  are  most  sensitive  to 
variations  in  g  near  the  left  edge  of  the  square.  The  low  frequency  data  should  aid  in  the 
reconstruction  of  g  near  the  either  vertical  edge. 
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(a)  Conductivity  structure  to  be  recon¬ 
structed 


(t*)  g{DHI,MID) 


(c)  g{DHI,MID,Lo)  (d)  go.^{DHI,MID,Lo) 


Figure  10:  In  (b)-(d)  the  estimates  of  g  in  (a)  are  displayed  using  various  combinations  of 
high,  middle  and  low  frequency  data.  From  (b),  the  high  and  medium  frequency  information 
provides  insufficient  information  to  reconstruct  the  anomaly  near  x  =  100.  As  seen  in  (c), 
the  addition  of  the  low  frequency,  cross- well  data  sets  clearly  improves  the  ability  to  resolve 
this  second  structure.  The  truncated  estimate  go.5(}{DHi,MiD,Lo)-  Note  that  there  is  little 
difference  between  this  function,  composed  of  75  non-zero  elements  in  the  wavelet  transform 
domain  and  the  optimal  estimate  g{DHi^MiD,Lo)  in  (c)  which  has  256  degrees  of  freedom. 
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(a)  The  finest  scale  of  to  which  2- 
oriented  detail  can  be  reconstructed  at 
r  =  0.5  given  only  high  and  middle 
frequency  data. 


(b)  The  finest  scale  of  to  which  x- 
oriented  detail  can  be  reconstructed  at 
T  =  0.5  given  only  high  and  middle 
frequency  data. 


(c)  The  finest  scale  of  to  which  2- 
oriented  detail  can  be  reconstructed  at 
T  =  0.5  given  high,  middle,  and  low 
frequency  data. 


(d)  The  finest  scale  of  to  which  x- 
oriented  detail  can  be  reconstructed  at 
r  =  0.5  given  high,  middle,  and  low 
frequency  data. 


Figure  11:  Maps  of  the  optimal  scale  of  reconstruction  for  the  2  and  x  components  of  detail 
for  the  threshold  value  r  =  0.5.  The  maps  verify  of  the  intuition  that  the  low-frequency, 
cross-well  data  provides  improved  resolution  especially  in  the  vicinity  of  the  right  vertical 
ede:e. 
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Experiment 

number 

Source 

Position 

Frequency 
of  source  (Hz) 

Receiver 

Array 

1,2,3 

T,M,B 

fni  =  398 

Left 

4,5,6 

T,M,B 

Imid  =  119 

Left 

7,8,9 

T,M,B 

fio  =  6 

Right 

Table  1:  Data  set  definitions  for  observation  processes  of  interest  in  the  paper.  The  abbrevi¬ 
ations  in  the  column  labeled  “Source  Position”  correspond  to  the  Top,  Middle,  and  Rottom 
line  sources  in  Figure  1 
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Parameter 

Value 

Wavelet 

Daubechies  6-tap 

M, 

6 

L, 

3 

/i 

1 

1 

Vl„ 

0.5 

SNU^  for  Dhi 

200 

SNK'^  for  Dmid 

400 

Background  conductivity 

1  S/m 

Table  2:  Parameters  for  radial  profiling  problem. 
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Parameter 

Value 

Parameter 

Value 

2;  Wavelet 

Daubechies  6-tap 

X  Wavelet 

Daubechies  2-tap 

^q,z 

4 

2 

2 

Lq,x 

1 

1 

l^X 

1 

1 

1 

PLg,Z 

V2 

PLc,X 

72 

SNU^  for  Dhi 

250 

SNK^  for 

500 

SNK^  for  Dlo 

1000 

Background  conductivity 

1  S/m 

Table  3:  Parameters  for  cross- well  tomography  problem 


